Understanding Three Simultaneous Speeches
Authors
Abstract
Understanding three simultaneous speeches is proposed as a challenge problem to foster research in artificial intelligence, speech and sound understanding or recognition, and computational auditory scene analysis. Automatic speech recognition in noisy environments is usually addressed with speech enhancement techniques such as noise reduction and speaker adaptation. However, the signal-to-noise ratio of speech in two simultaneous speeches is too poor for these techniques to apply, so novel techniques need to be developed. One candidate is to use speech stream segregation as a front end for automatic speech recognition systems. Preliminary experiments on understanding two simultaneous speeches show that the proposed challenge problem will be feasible with speech stream segregation. A detailed research plan and the benchmark sounds for the proposed challenge problem are also presented.
Similar resources
Effects of increasing modalities in understanding three simultaneous speeches with two microphones
This paper reports the effects of increasing modalities in understanding three simultaneous speeches with two microphones. This problem is difficult because the beamforming technique adopted for a microphone array needs at least four microphones, and because independent component analysis adopted for blind source separation needs at least three microphones. We investigate four cases: monaural (one ...
Challenge Problem for Computational Auditory Scene Analysis: Understanding Three Simultaneous Speeches
Understanding three simultaneous speeches is proposed as a challenge problem to foster artificial intelligence, speech and sound understanding or recognition, and computational auditory scene analysis research. Automatic speech recognition under noisy environments is attacked by speech enhancement techniques such as noise reduction and speaker adaptation. However, the signal-to-noise ratio of sp...
Separating three simultaneous speeches with two microphones by integrating auditory and visual processing
This paper addresses the problem of automatic recognition of three simultaneous speeches with two microphones, that is, that of sound source separation where the number of sound sources is greater than that of microphones. The approach used is the direction-pass filter, which is implemented by hypothetical reasoning on the interaural phase difference (IPD) and interaural intensity difference (I...
Improvement of three simultaneous speech recognition by using AV integration and scattering theory for humanoid
This paper presents an improvement in the recognition of three simultaneous speeches for a humanoid robot with a pair of microphones. In such situations, sound separation and automatic speech recognition (ASR) of the separated speech are difficult, because the number of simultaneous talkers exceeds that of its microphones, the signal-to-noise ratio is quite low (around -3 dB) and noise is not stable d...
Combining Independent Component Analysis and Sound Stream Segregation
This paper reports the issues and results of the AI Challenge "Understanding Three Simultaneous Speeches". First, the issues of the Challenge are revisited. We emphasize the importance of information fusion of various attributes of speeches (sounds) in separating speeches from a mixture of sounds. This emphasis is supported by comparing two methods of speech separation; computational auditory scene...